ENTROPY, INVERTIBILITY AND VARIATIONAL CALCULUS OF THE 
ADAPTED SHIFTS ON WIENER SPACE 



ALI SULEYMAN USTUNEL 



Abstract. In this work we study the necessary and sufficient conditions for a positive random 
variable whose expectation under the Wiener measure is one, to be represented as the Radon- 
Nikodym derivative of the image of the Wiener measure under an adapted perturbation of identity 
with the help of the associated innovation process. We prove that the innovation conjecture holds if 
and only if the original process is almost surely invertible. We also give variational characterizations 
of the invertibility of the perturbations of identity and the representability of a positive random 
variable whose total mass is equal to unity. We prove in particular that an adapted perturbation
of identity $U = I_W + u$ satisfying the Girsanov theorem is invertible if and only if the kinetic
energy of $u$ is equal to the entropy of the measure induced by the action of $U$ on the Wiener
measure $\mu$; in other words, $U$ is invertible iff

\[ \frac{1}{2}\int_W |u|_H^2 \, d\mu = \int_W \frac{dU\mu}{d\mu} \log\frac{dU\mu}{d\mu} \, d\mu. \]

The relations with the Monge-Kantorovitch measure transportation are also studied. An applica- 
tion of these results to a variational problem related to large deviations is also given. 

1. Introduction

This paper is devoted to the study of the following question: assume that $(W, H, \mu)$ is the classical
Wiener space, i.e., $W = C_0([0,1], \mathbb{R}^d)$ and $H$ is the corresponding Cameron-Martin space consisting of
the absolutely continuous, $\mathbb{R}^d$-valued functions on $[0,1]$ with square integrable derivatives. Assume
that $L$ is a strictly positive random variable whose expectation with respect to $\mu$ is one. We suppose
that there exists a map $U : W \to W$ of the form $U = I_W + u$, with $u : W \to H$ such that $u$ is adapted
to the filtration of the Wiener space and that $L$ is represented by $U$, i.e.,

\[ \frac{dU\mu}{d\mu} = L. \]
We suppose also that

\[ E[\rho(-\delta u)] = 1, \]

where

\[ \rho(-\delta u) = \exp\left( -\int_0^1 \dot{u}_s \, dW_s - \frac{1}{2}\int_0^1 |\dot{u}_s|^2 \, ds \right). \]

Then $U\mu$ is equivalent to $\mu$ and the corresponding Radon-Nikodym derivative $L$ can be represented
as an exponential martingale $\rho(-\delta v)$, where $v : W \to H$ satisfies properties similar to those satisfied
by $u$. The question we address is: what are the relations satisfied by the couple $(u, v)$? For instance,
if $U$ and $V = I_W + v$ are inverse to each other then the situation described above happens. However,
due to the celebrated example of Tsirelson (cf. [20]), we know that this is not the only case. We
concentrate particularly on this case with the help of the associated innovation processes, in
terms of which we give necessary and sufficient conditions for the representability (cf. [6]) of a
strictly positive density and for the invertibility of the associated perturbation of identity. The
innovation approach leads to a nice result which characterizes the invertibility of an adapted shift in
terms of the relative entropy of the measure which it induces. Namely, assume that $U = I_W + u$ is as
above; then it is invertible if and only if the relative entropy $H(U\mu|\mu)$ is equal to the kinetic energy
of $u$, i.e.,

\[ H(U\mu|\mu) = \frac{1}{2} E[|u|_H^2]. \]

In Physics the notion of entropy is an indication for the number of accessible states; here it is a 
remarkable fact that the relative entropy behaves as the physical entropy in the sense that if the 
system has just enough kinetic energy to fulfill the accessible states, i.e., if this energy is equal to 
the relative entropy of the probability distribution that it creates then the mapping is invertible. 
Besides, in general the kinetic energy is always greater than or equal to the relative entropy.

We apply these considerations to the innovation problem of filtering. Namely, it is a celebrated
question whether the sigma algebra generated by the observation process is equal to that of the 
innovation process. The case the signal is independent of the noise has been solved in p], here we 
solve this problem in terms of the entropy of the observed system. 

If we represent a density of the form $L = \rho(-\delta v)$ by $U = I_W + u$, then, modulo some integrability
hypothesis, the Girsanov theorem implies that $(I_W + v) \circ U = V \circ U$ is a Wiener process. We then
study the properties of $U \circ V$ using similar techniques. The relations with the Monge transportation
are also exhibited.

In the final part we use variational methods to characterize the invertibility and representability
of densities. As an application we give some new results for a particular case studied in [2].
Namely we give an explicit characterization of the solution of the minimization problem 



with the help of the entropic characterization of the invertibility explained above, where the inf is
taken in the space of adapted, $H$-valued Wiener functionals with finite energy and $f$ is a 1-convex
Wiener functional in the Sobolev space ©2,1 (H)- 



2. Preliminaries and notation

Let $W$ be the classical Wiener space with the Wiener measure $\mu$. The corresponding Cameron-Martin
space is denoted by $H$. Recall that the injection $H \hookrightarrow W$ is compact and its adjoint is the
natural injection $W^* \hookrightarrow H^* \subset L^2(\mu)$. A subspace $F$ of $H$ is called regular if the corresponding
orthogonal projection has a continuous extension to $W$, denoted again by the same letter. It is
well-known that there exists an increasing sequence of regular subspaces $(F_n, n \ge 1)$, called total,
such that $\cup_n F_n$ is dense in $H$ and in $W$. Let $\sigma(\pi_{F_n})$ be the $\sigma$-algebra generated by $\pi_{F_n}$; then for
any $f \in L^p(\mu)$, the martingale sequence $(E[f|\sigma(\pi_{F_n})], n \ge 1)$ converges to $f$ (strongly if $p < \infty$) in
$L^p(\mu)$. Observe that the function $f_n = E[f|\sigma(\pi_{F_n})]$ can be identified with a function on the finite
dimensional abstract Wiener space $(F_n, \mu_n, F_n)$, where $\mu_n = \pi_n\mu$. For notational simplicity, in the
sequel we shall denote $\pi_{F_n}$ by $\pi_n$.

Since the translations of $\mu$ with the elements of $H$ induce measures equivalent to $\mu$, the Gâteaux
derivative in the $H$ direction of the random variables is a closable operator on the $L^p(\mu)$-spaces and this
closure will be denoted by $\nabla$ (cf., for example, [3], [12], [13]). The corresponding Sobolev spaces (the
equivalence classes) of the real random variables will be denoted by $\mathbb{D}_{p,k}$, where $k \in \mathbb{N}$ is the order
of differentiability and $p > 1$ is the order of integrability. If the random variables take values in
some separable Hilbert space, say $\Phi$, then we define similarly the corresponding Sobolev spaces
and they are denoted by $\mathbb{D}_{p,k}(\Phi)$, $p > 1$, $k \in \mathbb{N}$. Since $\nabla : \mathbb{D}_{p,k} \to \mathbb{D}_{p,k-1}(H)$ is a continuous and
linear operator, its adjoint is a well-defined operator which we denote by $\delta$. $\delta$ coincides with the
Itô integral of the Lebesgue density of the adapted elements of $\mathbb{D}_{p,k}(H)$ (cf. [12], [15]).

For any $t \ge 0$ and measurable $f : W \to \mathbb{R}_+$, we denote by

\[ P_t f(x) = \int_W f\left(e^{-t}x + \sqrt{1 - e^{-2t}}\, y\right) \mu(dy); \]

it is well-known that $(P_t, t \in \mathbb{R}_+)$ is a hypercontractive semigroup on $L^p(\mu)$, $p > 1$, which is called
the Ornstein-Uhlenbeck semigroup (cf. [3], [12], [13]). Its infinitesimal generator is denoted by $-\mathcal{L}$ and
we call $\mathcal{L}$ the Ornstein-Uhlenbeck operator (sometimes called the number operator by physicists).
The norms defined by

\[ \|\varphi\|_{p,k} = \|(I + \mathcal{L})^{k/2}\varphi\|_{L^p(\mu)} \tag{2.1} \]

are equivalent to the norms defined by the iterates of the Sobolev derivative $\nabla$. This observation
permits us to identify the dual of the space $\mathbb{D}_{p,k}(\Phi)$, $p > 1$, $k \in \mathbb{N}$, with $\mathbb{D}_{q,-k}(\Phi')$, where $q^{-1} = 1 - p^{-1}$
and the latter space is defined by replacing $k$ in (2.1) by $-k$. This gives us the distribution spaces
on the Wiener space $W$ (in fact we can take as $k$ any real number). An easy calculation shows
that, formally, $\delta \circ \nabla = \mathcal{L}$, and this permits us to extend the divergence and the derivative operators
to the distributions as linear, continuous operators. In fact $\delta : \mathbb{D}_{q,k}(H \otimes \Phi) \to \mathbb{D}_{q,k-1}(\Phi)$ and
$\nabla : \mathbb{D}_{q,k}(\Phi) \to \mathbb{D}_{q,k-1}(H \otimes \Phi)$ continuously, for any $q > 1$ and $k \in \mathbb{R}$, where $H \otimes \Phi$ denotes
the completed Hilbert-Schmidt tensor product (cf., for instance, [3], [12], [13]). Finally, in the case of
the classical Wiener space, we denote by $\mathbb{D}^a_{p,k}(H)$ the subspace defined by

\[ \mathbb{D}^a_{p,k}(H) = \{\xi \in \mathbb{D}_{p,k}(H) : \dot{\xi} \text{ is adapted}\} \]

for $p > 1$, $k \in \mathbb{R}$.
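The Mehler-type formula for $P_t f$ can be checked numerically in the simplest setting. The following is a minimal sketch, assuming the one-dimensional Gaussian space $(\mathbb{R}, N(0,1))$ in place of $W$; the function name `ou_semigroup_1d`, the quadrature order and the test polynomial are illustrative choices and do not come from the paper. It uses Gauss-Hermite quadrature and the known fact that the second Hermite polynomial $x^2 - 1$ is an eigenfunction of $P_t$ with eigenvalue $e^{-2t}$.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def ou_semigroup_1d(f, t, x, order=40):
    """One-dimensional Mehler formula:
    P_t f(x) = E[f(e^{-t} x + sqrt(1 - e^{-2t}) Y)], with Y ~ N(0, 1)."""
    nodes, weights = hermegauss(order)        # quadrature for the weight exp(-y^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)  # normalise to the N(0, 1) law
    decay = np.exp(-t)
    spread = np.sqrt(1.0 - np.exp(-2.0 * t))
    return float(np.sum(weights * f(decay * x + spread * nodes)))


def he2(y):
    # Second (probabilists') Hermite polynomial
    return y**2 - 1.0


x, t = 0.7, 0.3
print(ou_semigroup_1d(he2, t, x), np.exp(-2.0 * t) * he2(x))
```

The two printed numbers should agree up to rounding, which reflects that $\mathcal{L}$ acts as multiplication by $n$ on the $n$-th Wiener chaos.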

Let us recall some facts from convex analysis. Let $K$ be a Hilbert space; a subset $S$ of $K \times K$
is called cyclically monotone if any finite subset $\{(x_1, y_1), \ldots, (x_N, y_N)\}$ of $S$ satisfies the following
algebraic condition:

\[ (y_1, x_2 - x_1) + (y_2, x_3 - x_2) + \cdots + (y_{N-1}, x_N - x_{N-1}) + (y_N, x_1 - x_N) \le 0, \]

where $(\cdot, \cdot)$ denotes the inner product of $K$. It turns out that $S$ is cyclically monotone if and only if

\[ \sum_{i=1}^N (y_i, x_{\sigma(i)} - x_i) \le 0 \]

for any permutation $\sigma$ of $\{1, \ldots, N\}$ and for any finite subset $\{(x_i, y_i) : i = 1, \ldots, N\}$ of $S$. Note
that $S$ is cyclically monotone if and only if any translate of it is cyclically monotone. By a theorem
of Rockafellar, any cyclically monotone set is contained in the graph of the subdifferential of a
convex function in the sense of convex analysis ([10]), and even if the function may not be unique its
subdifferential is unique.
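The permutation form of the condition lends itself to a direct brute-force check on a finite set. The sketch below is only an illustration in $K = \mathbb{R}^2$; the function name `is_cyclically_monotone` and the sample points are hypothetical. It tests the graph of the gradient of the convex function $x \mapsto |x|^2/2$ (that is, $y = x$), which must pass, and the graph of $y = -x$, which must fail.

```python
from itertools import permutations


def is_cyclically_monotone(pairs, tol=1e-12):
    """Check sum_i <y_i, x_{sigma(i)} - x_i> <= 0 for every permutation sigma
    of the given finite family of pairs (x_i, y_i)."""
    n = len(pairs)
    for sigma in permutations(range(n)):
        total = 0.0
        for i in range(n):
            x_i, y_i = pairs[i]
            x_s, _ = pairs[sigma[i]]
            total += sum(a * (b - c) for a, b, c in zip(y_i, x_s, x_i))
        if total > tol:
            return False
    return True


points = [(0.0, 1.0), (2.0, -1.0), (-1.5, 0.5), (1.0, 3.0)]
gradient_graph = [(x, x) for x in points]                     # y = grad(|x|^2 / 2)
reversed_graph = [(x, tuple(-c for c in x)) for x in points]  # y = -x

print(is_cyclically_monotone(gradient_graph))
print(is_cyclically_monotone(reversed_graph))
```

Since permutations with fixed points are included, the loop also covers every sub-cycle of the given family. The cost grows factorially in $N$, so this is meant for small illustrative sets only.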

Let now $(W, \mu, H)$ be an abstract Wiener space; a measurable function $f : W \to \mathbb{R} \cup \{\infty\}$ is called
1-convex if the map

\[ h \mapsto f(x + h) + \frac{1}{2}|h|_H^2 = F(x, h) \]

is convex on the Cameron-Martin space $H$ with values in $L^0(\mu)$. Note that this notion is compatible
with the $\mu$-equivalence classes of random variables thanks to the Cameron-Martin theorem. It is
proven in [4] that this definition is equivalent to the following condition: Let $(\pi_n, n \ge 1)$ be a sequence
of regular, finite dimensional, orthogonal projections of $H$, increasing to the identity map $I_H$. Denote
also by $\pi_n$ its continuous extension to $W$ and define $\pi_n^\perp = I_W - \pi_n$. For $x \in W$, let $x_n = \pi_n x$ and
$x_n^\perp = \pi_n^\perp x$. Then $f$ is 1-convex if and only if

\[ x_n \mapsto \frac{1}{2}|x_n|_H^2 + f(x_n + x_n^\perp) \]

is $\pi_n^\perp\mu$-almost surely convex.
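To make the definition concrete, here is a small sketch in the one-dimensional case $W = H = \mathbb{R}$, which is an assumption made purely for illustration. A quadratic $f(y) = -c\,y^2$ is 1-convex exactly when its second derivative $-2c$ is at least $-1$; the helper names `is_convex_on_grid` and `shifted_map` are hypothetical. Convexity of $h \mapsto f(x+h) + h^2/2$ is tested through second differences on a grid.

```python
def is_convex_on_grid(g, grid, tol=1e-9):
    """Second-difference test on an equally spaced grid:
    g(a) - 2 g(b) + g(c) >= 0 for consecutive grid points a < b < c."""
    values = [g(h) for h in grid]
    return all(values[i - 1] - 2.0 * values[i] + values[i + 1] >= -tol
               for i in range(1, len(values) - 1))


def shifted_map(f, x):
    # h -> f(x + h) + |h|^2 / 2, the map F(x, .) from the definition
    return lambda h: f(x + h) + 0.5 * h * h


grid = [-3.0 + 0.1 * k for k in range(61)]


def concave_mild(y):
    return -0.4 * y * y   # second derivative -0.8 >= -1: 1-convex


def concave_strong(y):
    return -0.9 * y * y   # second derivative -1.8 < -1: not 1-convex


print(is_convex_on_grid(shifted_map(concave_mild, 0.5), grid))
print(is_convex_on_grid(shifted_map(concave_strong, 0.5), grid))
```

The example shows that a 1-convex function need not be convex: the quadratic term $\frac{1}{2}|h|_H^2$ compensates a bounded amount of concavity.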

2.1. Preliminaries about the Monge-Kantorovitch measure transportation problem. 

Definition 1. Let $\xi$ and $\eta$ be two probabilities on $(W, \mathcal{B}(W))$. We say that a probability $\gamma$ on
$(W \times W, \mathcal{B}(W \times W))$ is a solution of the Monge-Kantorovitch problem associated to the couple $(\xi, \eta)$
if the first marginal of $\gamma$ is $\xi$, the second one is $\eta$ and if

\[ J(\gamma) = \int_{W \times W} |x - y|_H^2 \, d\gamma(x, y) = \inf\left\{ \int_{W \times W} |x - y|_H^2 \, d\beta(x, y) : \beta \in \Sigma(\xi, \eta) \right\}, \]

where $\Sigma(\xi, \eta)$ denotes the set of all the probability measures on $W \times W$ whose first and second
marginals are respectively $\xi$ and $\eta$. We shall denote the Wasserstein distance between $\xi$ and $\eta$,
which is the positive square-root of this infimum, by $d_H(\xi, \eta)$.

Remark: By the weak compactness of probability measures on $W \times W$ and the lower semi-continuity
of the strictly convex cost function, the infimum in the definition is attained even if
the functional $J$ is identically infinity. In this latter case we say that the solution is degenerate.
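A finite analogue may help to fix ideas. The sketch below assumes two empirical measures on the real line with the same number of equally weighted atoms and the Euclidean quadratic cost in place of $|x - y|_H^2$; these choices, the sample data and the function names are illustrative only. For such measures an optimal coupling can be taken among permutation couplings (Birkhoff's theorem), so brute force over permutations gives the infimum, and it coincides with the cost of the monotone (sorted) matching.

```python
from itertools import permutations


def transport_cost(xs, ys, sigma):
    # Quadratic cost of the coupling that sends xs[i] to ys[sigma[i]]
    return sum((x - ys[j]) ** 2 for x, j in zip(xs, sigma)) / len(xs)


def optimal_cost_bruteforce(xs, ys):
    # Infimum of the cost over all permutation couplings
    return min(transport_cost(xs, ys, s) for s in permutations(range(len(xs))))


def monotone_cost(xs, ys):
    # Pair the k-th smallest x with the k-th smallest y
    return sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


xs = [0.3, -1.2, 2.5, 0.9, -0.4]
ys = [1.1, 0.0, -2.0, 3.2, 0.7]
print(optimal_cost_bruteforce(xs, ys), monotone_cost(xs, ys))
```

The monotone matching is the one-dimensional counterpart of a transport map with cyclically monotone graph.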

The next result, which is the extension of the finite dimensional version of an inequality due 
to Talagrand, [H], gives a sufficient condition for the finiteness of the Wasserstein distance in the 
case one of the measures is the Wiener measure $\mu$ and the second one is absolutely continuous with
respect to it. We give a short proof for the sake of completeness:

Theorem 1. Let $L \in L\log L(\mu)$ be a positive random variable with $E[L] = 1$ and let $\nu$ be the
measure $d\nu = L\,d\mu$. We then have

\[ d_H^2(\nu, \mu) \le 2E[L\log L]. \tag{2.2} \]

Proof: Let us remark first that we can take $W$ as the classical Wiener space $W = C_0([0,1])$ and,
using the stopping techniques of martingale theory, we may assume that $L$ is upper and lower
bounded almost surely. Then a classical result of the Itô calculus implies that $L$ can be represented
as an exponential martingale

\[ L_t = \exp\left\{ -\int_0^t \dot{u}_\tau \, dW_\tau - \frac{1}{2}\int_0^t |\dot{u}_\tau|^2 \, d\tau \right\}, \]

with $L = L_1$, where $(\dot{u}_t, t \in [0,1])$ is a measurable process adapted to the filtration of the canonical
Wiener process $(t, x) \to W_t(x) = x(t)$. Let us define $u : W \to H$ as $u(t, x) = \int_0^t \dot{u}_\tau(x)\,d\tau$ and
$U : W \to W$ as $U(x) = x + u(x)$. The Girsanov theorem implies that $x \to U(x)$ is a Brownian motion
under $\nu$, hence the image of the measure $\nu$ under the map $U \times I_W : W \to W \times W$, denoted by
$\beta = (U \times I_W)\nu$, belongs to $\Sigma(\mu, \nu)$. Let $\gamma$ be any optimal measure; then

\[ J(\gamma) = d_H^2(\nu, \mu) \le \int_{W \times W} |x - y|_H^2 \, d\beta(x, y) = E[|u|_H^2 L] = 2E[L\log L], \]

where the last equality follows also from the Girsanov theorem and the Itô stochastic calculus. □

Here and in the sequel, $E$ denotes the expectation with respect to the Wiener measure.
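The inequality (2.2) can be verified in closed form in a one-dimensional Gaussian family. The sketch below assumes $\mu = N(0,1)$ on $\mathbb{R}$ and $\nu = N(m, s^2)$, a finite-dimensional stand-in for the Wiener space setting; the function names are hypothetical. It uses the standard formulas for the relative entropy and the quadratic Wasserstein distance between one-dimensional Gaussian laws; the gap then equals $2(s - 1 - \log s) \ge 0$, with equality for pure shifts ($s = 1$).

```python
import math


def relative_entropy(m, s):
    """E[L log L] for L = dN(m, s^2) / dN(0, 1)."""
    return 0.5 * (s * s + m * m - 1.0) - math.log(s)


def wasserstein_sq(m, s):
    """Squared quadratic Wasserstein distance between N(m, s^2) and N(0, 1)."""
    return m * m + (s - 1.0) ** 2


def talagrand_gap(m, s):
    # 2 E[L log L] - d^2(nu, mu); nonnegative by the inequality (2.2)
    return 2.0 * relative_entropy(m, s) - wasserstein_sq(m, s)


for m, s in [(0.0, 0.5), (1.7, 1.0), (-2.0, 2.5)]:
    print(m, s, talagrand_gap(m, s))
```

The case $s = 1$ corresponds to a Cameron-Martin shift, for which the transport cost and twice the entropy coincide.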

The next two theorems, which explain the existence and several properties of the solutions of Monge- 
Kantorovitch problem and the transport maps have been proven in [5]. 

Theorem 2 (General case). Suppose that $\rho$ and $\nu$ are two probability measures on $W$ such that

\[ d_H(\rho, \nu) < \infty. \]

Let $(\pi_n, n \ge 1)$ be a total increasing sequence of regular projections (of $H$, converging to the identity
map of $H$). Suppose that, for any $n \ge 1$, the regular conditional probabilities $\rho(\cdot\,|\pi_n^\perp = x_n^\perp)$ vanish
$\pi_n^\perp\rho$-almost surely on the subsets of $(\pi_n^\perp)^{-1}(W)$ with Hausdorff dimension $n - 1$. Then there exists
a unique solution of the Monge-Kantorovitch problem, denoted by $\gamma \in \Sigma(\rho, \nu)$, and $\gamma$ is supported by
the graph of a Borel map $T$ which is the solution of the Monge problem. $T : W \to W$ is of the form
$T = I_W + \xi$, where $\xi \in H$ almost surely. Besides we have

\[ d_H^2(\rho, \nu) = \int_{W \times W} |T(x) - x|_H^2 \, d\gamma(x, y) = \int_W |T(x) - x|_H^2 \, d\rho(x), \]

and for $\pi_n^\perp\rho$-almost all $x_n^\perp$, the map $u \to \xi(u + x_n^\perp)$ is cyclically monotone on $(\pi_n^\perp)^{-1}\{x_n^\perp\}$,
in the sense that

\[ \sum_{i=1}^N \left(\xi(x_n^\perp + u_i), u_{i+1} - u_i\right)_H \le 0 \]

$\pi_n^\perp\rho$-almost surely, for any cyclic sequence $\{u_1, \ldots, u_N, u_{N+1} = u_1\}$ from $\pi_n(W)$. Finally, if, for
any $n \ge 1$, $\pi_n^\perp\nu$-almost surely, $\nu(\cdot\,|\pi_n^\perp = y_n^\perp)$ also vanishes on the $(n-1)$-Hausdorff dimensional
subsets of $(\pi_n^\perp)^{-1}(W)$, then $T$ is invertible, i.e., there exists $S : W \to W$ of the form $S = I_W + \eta$
such that $\eta \in H$ satisfies a similar cyclic monotonicity property as $\xi$ and that

\[ 1 = \gamma\{(x, y) \in W \times W : T \circ S(y) = y\} = \gamma\{(x, y) \in W \times W : S \circ T(x) = x\}. \]

In particular we have

\[ d_H^2(\rho, \nu) = \int_{W \times W} |S(y) - y|_H^2 \, d\gamma(x, y) = \int_W |S(y) - y|_H^2 \, d\nu(y). \]

Remark 1. In particular, for all the measures $\rho$ which are absolutely continuous with respect to the
Wiener measure $\mu$, the second hypothesis is satisfied, i.e., the measure $\rho(\cdot\,|\pi_n^\perp = x_n^\perp)$ vanishes on
the sets of Hausdorff dimension $n - 1$.

The case where one of the measures is the Wiener measure and the other is absolutely continuous
with respect to $\mu$ is the most important one for the applications. Consequently we give the related
results separately in the following theorem, where the tools of the Malliavin calculus give more
information about the maps $\xi$ and $\eta$ of Theorem 2.

Theorem 3 (Gaussian case). Let $\nu$ be the measure $d\nu = L\,d\mu$, where $L$ is a positive random variable
with $E[L] = 1$. Assume that $d_H(\mu, \nu) < \infty$ (for instance $L \in L\log L$). Then there exists a 1-convex
function $\varphi \in \mathbb{D}_{2,1}$, unique up to a constant, such that the map $T = I_W + \nabla\varphi$ is the unique solution
of the original problem of Monge. Moreover, its graph supports the unique solution $\gamma$ of the
Monge-Kantorovitch problem. Consequently

\[ (I_W \times T)\mu = \gamma. \]

In particular $T$ maps $\mu$ to $\nu$ and $T$ is almost surely invertible, i.e., there exists some $T^{-1}$ such that
$T^{-1}\nu = \mu$ and that

\[ 1 = \mu\{x : T^{-1} \circ T(x) = x\} = \nu\{y \in W : T \circ T^{-1}(y) = y\}. \]

Remark 2. Assume that the operator $\nabla$ is closable with respect to $\nu$; then we have $\eta = \nabla\psi$. In
particular, if $\nu$ and $\mu$ are equivalent, then we have

\[ T^{-1} = I_W + \nabla\psi, \]

where $\psi$ is a 1-convex function. $\psi$ is called the dual potential of the MKP$(\mu, \nu)$ and we have the
following relations:

\[ \varphi(x) + \psi(y) + \frac{1}{2}|x - y|_H^2 \ge 0 \]

for any $x, y \in W$, and

\[ \varphi(x) + \psi(y) + \frac{1}{2}|x - y|_H^2 = 0 \]

$\gamma$-almost surely.

Remark 3. Let $(e_n, n \in \mathbb{N})$ be a complete orthonormal basis of $H$, denote by $V_n$ the sigma algebra
generated by $\{\delta e_1, \ldots, \delta e_n\}$ and let $L_n = E[L|V_n]$. If $\varphi_n \in \mathbb{D}_{2,1}$ is the function constructed in
Theorem 3, corresponding to $L_n$, then, using the inequality (2.2), we can prove that the sequence
$(\varphi_n, n \in \mathbb{N})$ converges to $\varphi$ in $\mathbb{D}_{2,1}$.

3. Characterization of the invertible shifts 

Let us begin with some results of general interest. Let us first define: 

Definition 2. A measurable map $T : W \to W$ is called ($\mu$-) almost surely right invertible if there
exists a measurable map $S : W \to W$ such that $S\mu \ll \mu$ and $T \circ S = I_W$ $\mu$-a.s. Similarly, we say
that it is left invertible if $T\mu \ll \mu$ and if there exists a measurable map $S : W \to W$ such that
$S \circ T = I_W$ $\mu$-a.s.

The following proposition some parts of which are proven in [19], shows that, whenever an adapted 
shift has a left inverse almost surely, then it is almost surely invertible and its inverse is also an 
adapted perturbation of identity and it relates this concept to the existence and uniqueness of strong 
solutions of stochastic differential equations. The a 

Proposition 1. Assume $A = I_W + a$, $a \in L^2(\mu, H)$, $\dot{a}$ is adapted, $E[\rho(-\delta a)] = 1$. Suppose that
there exists a map $B : W \to W$ such that $B \circ A = I_W$ a.s. Then the following assertions are true:

(i) $B\mu$ is equivalent to $\mu$ and $A \circ B = I_W$ a.s., i.e., $B$ is also a right inverse.

(ii) $B = I_W + b$, $b : W \to H$, $\dot{b}$ is also adapted.

(iii) $(t, w) \to B_t(w)$ is the strong solution of

\[ dB_t = -\dot{a}_t \circ B \, dt + dW_t, \qquad B_0 = 0. \tag{3.3} \]

(iv) We have

\[ \dot{a}_t + \dot{b}_t \circ A = 0, \tag{3.4} \]
\[ \dot{b}_t + \dot{a}_t \circ B = 0, \tag{3.5} \]

$dt \times d\mu$-a.s.

(v) In particular, either the property $A\mu \sim \mu$ and the relation (3.4) together, or $B\mu \sim \mu$ and the
relation (3.5) together, imply that $B \circ A = A \circ B = I_W$ a.s.
Proof: For any $f \in C_b(W)$, it follows from the Girsanov theorem that

\[ E[f \circ B] = E[f \circ B \circ A \, \rho(-\delta a)] = E[f \rho(-\delta a)], \]

hence $B\mu$ is equivalent to $\mu$ and the corresponding Radon-Nikodym density is $\rho(-\delta a)$. Let

\[ D = \{w \in W : B \circ A(w) = w\}. \]

Since $D \subset A^{-1}(A(D))$ and by the hypothesis $\mu(D) = 1$, we get

\[ E[1_{A(D)} \circ A] = 1. \]

Since $A\mu$ is equivalent to $\mu$ we have also $\mu(A(D)) = 1$. If $w \in A(D)$, then $w = A(d)$ for some
$d \in D$, hence $A \circ B(w) = A \circ B \circ A(d) = A(d) = w$; consequently $A \circ B = I_W$ $\mu$-almost surely and
$B$ is the two-sided inverse of $A$. Evidently, together with the absolute continuity of $B\mu$, this implies
that $B$ is of the form $B = I_W + b$, with $b : W \to H$. Moreover, $\dot{a} = -\dot{b} \circ A$, hence the right hand
side is adapted. We can assume that all these processes are one-dimensional (otherwise we proceed
componentwise). Let $\dot{b}^n = \max(-n, \min(\dot{b}, n))$. Then $\dot{b}^n \circ A$ is adapted. Let $H \in L^2(dt \times d\mu)$ be
an adapted process. Using the Girsanov theorem:

\[ E\left[\rho(-\delta a)\int_0^1 \dot{b}^n_s \circ A \, H_s \circ A \, ds\right] = E\left[\int_0^1 \dot{b}^n_s H_s \, ds\right] = E\left[\int_0^1 E[\dot{b}^n_s|\mathcal{F}_s] H_s \, ds\right] = E\left[\rho(-\delta a)\int_0^1 E[\dot{b}^n_s|\mathcal{F}_s] \circ A \, H_s \circ A \, ds\right]. \]

Consequently

\[ E[\dot{b}^n_s|\mathcal{F}_s] \circ A = \dot{b}^n_s \circ A \]

almost surely. Since $A\mu$ is equivalent to $\mu$, it follows that

\[ E[\dot{b}^n_s|\mathcal{F}_s] = \dot{b}^n_s \]

almost surely, hence $\dot{b}^n$ and consequently $\dot{b}$ are adapted. It is now clear that $(B(t), t \in [0,1])$ is
a strong solution of (3.3). The uniqueness follows from the fact that any strong solution of (3.3)
would be a right inverse to $A$; since $A$ is invertible, this solution is equal to $B$.

The proof of (v) is quite similar to that of the first part: let $D = \{w \in W : A \circ B(w) = w\}$;
then $\mu(B^{-1}(B(D))) = 1$, hence $B \circ A = I_W$ $\mu$-a.s. Moreover $B$ can be written as $B = I_W + b$, with
$\dot{a} = -\dot{b} \circ A$; proceeding as above, we show that $\dot{b}$ is adapted and the rest of the proof follows. □
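The link between the inverse $B$ and the equation (3.3) can be illustrated on a discretized path. The sketch below assumes the linear drift $\dot{a}_t(w) = c\,w(t)$, a uniform time grid and a left-point Riemann sum; all of these, as well as the names `shift_A` and `inverse_B`, are illustrative assumptions. With this discretization the Euler scheme for $dB_t = -c\,B_t\,dt + dW_t$, $B_0 = 0$, inverts the discretized shift exactly on both sides.

```python
import numpy as np


def shift_A(w, c, dt):
    # A(w)(t) = w(t) + c * int_0^t w(s) ds, with a left-point Riemann sum
    integral = np.concatenate(([0.0], np.cumsum(w[:-1]) * dt))
    return w + c * integral


def inverse_B(w, c, dt):
    # Euler scheme for dB_t = -c B_t dt + dW_t with B_0 = 0,
    # written in integrated form: B_{k+1} = w_{k+1} - c * dt * sum_{j<=k} B_j
    b = np.zeros_like(w)
    running = 0.0
    for k in range(len(w) - 1):
        running += b[k] * dt
        b[k + 1] = w[k + 1] - c * running
    return b


rng = np.random.default_rng(0)
n = 200
dt = 1.0 / n
w = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))
c = 1.3

print(np.max(np.abs(shift_A(inverse_B(w, c, dt), c, dt) - w)))
print(np.max(np.abs(inverse_B(shift_A(w, c, dt), c, dt) - w)))
```

Both printed deviations should be at the level of rounding error, the discrete counterpart of $A \circ B = B \circ A = I_W$. The recursion in `inverse_B` uses only past values of the path, which mirrors the adaptedness of $\dot{b}$.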

The invertibility of $A$ is characterized in terms of the corresponding Wick exponentials as below:

Theorem 4. Let $A = I_W + a$, $a \in L^0_a(\mu, H)$. Assume that $E[\rho(-\delta a)] = 1$ and that

\[ \frac{dA\mu}{d\mu} \circ A \; \rho(-\delta a) = 1 \]

almost surely. Then $A$ is (almost surely) invertible.

Proof: Since $E[\rho(-\delta a)] = 1$, $A\mu$ is equivalent to $\mu$, hence the corresponding Radon-Nikodym
derivative can be expressed as an exponential martingale:

\[ l = \frac{dA\mu}{d\mu} = \exp\left(-\delta b - \frac{1}{2}|b|_H^2\right), \]

where $b(t, w) = \int_0^t \dot{b}_s(w)\,ds$, with $\dot{b}$ adapted, $\int_0^1 |\dot{b}_s|^2 ds < \infty$ almost surely and $\delta b$ is defined in $L^0(\mu)$.
(Here we denote by $\delta a$ the stochastic integral of the adapted process $\dot{a}$ in $L^0(\mu)$.)
The hypothesis implies that

\[ \delta(a + b \circ A) + \frac{1}{2}|a + b \circ A|_H^2 = 0 \tag{3.6} \]

almost surely. Define the local martingale $(M_t)$ as

\[ M_t = \exp\left(-\int_0^t (\dot{a}_s + \dot{b}_s \circ A)\,dW_s - \frac{1}{2}\int_0^t |\dot{a}_s + \dot{b}_s \circ A|^2 \, ds\right). \]

The relation (3.6) implies in fact that $(M_t)$ is a uniformly integrable martingale with its final value
(at $t = 1$) $M_1 = 1$. Consequently $M_t = 1$ almost surely for any $t \in [0,1]$ and this implies that

\[ \dot{a}_s + \dot{b}_s \circ A = 0 \]

$ds \times d\mu$-almost surely. Hence $(I_W + b) \circ A = I_W$ almost surely and the proof is completed
thanks to Proposition 1. □



Proposition 2. Assume that $(A_n, n \ge 1)$ is a sequence of mappings of the form $A_n = I_W + a_n$,
with $a_n : W \to H$, $\dot{a}_n$ is adapted for any $n$ and $(a_n, n \ge 1)$ converges to some $a$ in $L^0(\mu, H)$ such
that $E[\rho(-\delta a)] = 1$. Suppose that, for any $n \ge 1$, $E[\rho(-\delta a_n)] = 1$ and $A_n$ is invertible. If

\[ \lim_{n \to \infty} \frac{dA_n\mu}{d\mu} = l \]

exists in the norm topology of $L^1(\mu)$, then $A = I_W + a$ is also invertible.

Proof: Let us denote by $l_n$ the Radon-Nikodym derivative of $A_n\mu$ with respect to $\mu$. The hypothesis
implies that $(l_n, n \ge 1)$ is uniformly integrable. Since $(a_n, n \ge 1)$ converges in probability, the
uniform integrability, combined with the Lusin theorem, implies that $(l_n \circ A_n, n \ge 1)$ converges in
probability to $l \circ A$. Since $(\rho(-\delta a_n), n \ge 1)$ converges to $\rho(-\delta a)$ in probability and since, by the
invertibility of $A_n$, we have

\[ l_n \circ A_n \, \rho(-\delta a_n) = 1 \]

almost surely for any $n \ge 1$, we have also

\[ l \circ A \, \rho(-\delta a) = 1 \]

almost surely. The conclusion then follows from Theorem 4. □



The following lemma gives important information about the Radon-Nikodym density of the
measure $A\mu$ with respect to $\mu$:

Lemma 1. Assume that $A = I_W + a$ with $a \in L^0_a(\mu, H)$, i.e., with $\dot{a}$ adapted. Then

\[ \frac{dA\mu}{d\mu} \circ A \; E[\rho(-\delta a)|A] \le 1 \]

almost surely. If we have also $E[\rho(-\delta a)] = 1$, then the above inequality becomes an equality:

\[ \frac{dA\mu}{d\mu} \circ A \; E[\rho(-\delta a)|A] = 1 \]

almost surely.
Proof: For any positive function $f \in C_b(W)$, using the Girsanov theorem and the Fatou lemma,
we have

\[ E[f \circ A] = E\left[f \, \frac{dA\mu}{d\mu}\right] \ge E\left[f \circ A \; \frac{dA\mu}{d\mu} \circ A \; \rho(-\delta a)\right] = E\left[f \circ A \; \frac{dA\mu}{d\mu} \circ A \; E[\rho(-\delta a)|A]\right], \]

which proves the first part of the lemma. For the second part, due to the integrability hypothesis,
we can replace the inequality above by the equality and the proof follows. □



4. Properties of non-invertible adapted perturbation of identity 
In this section we study the following concept: 

Definition 3. A positive random variable $L$ whose expectation is equal to one with respect to the Wiener
measure is said to be representable with a mapping $U : W \to W$ if

\[ \frac{dU\mu}{d\mu} = L. \]

We begin with the following 

Proposition 3. Assume that $L = \rho(-\delta v)$, where $v \in L^0_a(\mu, H)$, i.e., $\dot{v}$ is adapted and $\int_0^1 |\dot{v}_s|^2 ds < \infty$
a.s. Then there exists $U = I_W + u$, with $u : W \to H$ adapted, such that $U\mu = L\mu$ and $E[\rho(-\delta u)] = 1$
if and only if the following condition is satisfied:

\begin{align}
1 &= L_t \circ U \; E[\rho(-\delta u^t)|\mathcal{U}_t] \tag{4.7} \\
  &= L_t \circ U \; E[\rho(-\delta u)|\mathcal{U}_t] \tag{4.8}
\end{align}

almost surely for any $t \in [0,1]$, where $u^t$ is defined as $u^t(\tau) = \int_0^{t \wedge \tau} \dot{u}_s \, ds$ and $\mathcal{U}_t$ is the sigma algebra
generated by $(w(\tau) + u(\tau), \tau \le t)$.

Proof: Let $U_t$ be defined as $I_W + u^t$; then for any $f \in C_b(W)$ which is $\mathcal{F}_t$-measurable, we have

\[ E[f \circ U_t \; L_t \circ U_t \; \rho(-\delta u^t)] = E[f L_t] = E[f \circ U_t]. \]

Since, for any $\mathcal{F}_t$-measurable function $G$, $G \circ U_t$ is $\mathcal{U}_t$-measurable, we get

\[ L_t \circ U_t \; E[\rho(-\delta u^t)|\mathcal{U}_t] = 1. \]

Conversely, it follows from the relation (4.7) and from the Girsanov theorem that

\[ E[f \circ U] = E[f \circ U \; L \circ U \; \rho(-\delta u)] = E[f L]; \]

a similar relation holds when we replace $U$ by $U_t$. □

Let us calculate $E[\rho(-\delta u^t)|\mathcal{U}_t] = E[\rho(-\delta u)|\mathcal{U}_t]$ in terms of the innovation process associated to $U$.
Recall that the term innovation, which originates from filtering theory, is defined as (cf. [7])

\[ Z_t = U_t - \int_0^t E[\dot{u}_s|\mathcal{U}_s] \, ds, \]

and it is a $\mu$-Brownian motion with respect to the filtration $(\mathcal{U}_t, t \in [0,1])$. A proof similar to the one
in [7] shows that any martingale with respect to the filtration of $U$ can be represented as a stochastic
integral with respect to $Z$. Hence, by the positivity assumption, $E[\rho(-\delta u)|\mathcal{U}_t]$ can be written as an
exponential martingale

\[ E[\rho(-\delta u)|\mathcal{U}_t] = \exp\left(-\int_0^t (\alpha_s, dZ_s) - \frac{1}{2}\int_0^t |\alpha_s|^2 \, ds\right) \]

for some process $(\alpha_s)$ adapted to $(\mathcal{U}_t)$.

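Below we give a more detailed result:

Proposition 4. We have the following explicit result:

\[ E[\rho(-\delta u)|\mathcal{U}] = \exp\left(-\int_0^1 (E[\dot{u}_s|\mathcal{U}_s], dZ_s) - \frac{1}{2}\int_0^1 |E[\dot{u}_s|\mathcal{U}_s]|^2 \, ds\right), \tag{4.9} \]

hence

\[ E[\rho(-\delta u)|\mathcal{U}_t] = \exp\left(-\int_0^t (E[\dot{u}_s|\mathcal{U}_s], dZ_s) - \frac{1}{2}\int_0^t |E[\dot{u}_s|\mathcal{U}_s]|^2 \, ds\right) \tag{4.10} \]

almost surely.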

Proof: The proof follows from a double application of the Girsanov theorem. Let us denote by $l_t$
the Girsanov exponential

\[ l_t = \exp\left(-\int_0^t (E[\dot{u}_s|\mathcal{U}_s], dZ_s) - \frac{1}{2}\int_0^t |E[\dot{u}_s|\mathcal{U}_s]|^2 \, ds\right). \]

On the one hand, we have, for any $f \in C_b(W)$,

\[ E[f \circ U \rho(-\delta u)] = E[f], \]

and on the other hand, applying the Girsanov theorem to the decomposition

\[ U_t = Z_t + \int_0^t E[\dot{u}_s|\mathcal{U}_s] \, ds, \]

we get

\[ E[f \circ U \, l_1] \le E[f] = E[f \circ U \rho(-\delta u)] \]

for any positive, measurable $f$ on $W$. Taking $f$ to be $\mathcal{F}_t$-measurable, we conclude that

\[ l_t \le E[\rho(-\delta u)|\mathcal{U}_t] \]

a.s. for any $t \in [0,1]$. Consequently $(l_t, t \in [0,1])$ is a uniformly integrable martingale and in
particular $E[l_1] = 1$. Hence we have

\[ E[f \circ U \, l_1] = E[f] = E[f \circ U \rho(-\delta u)] \]

for any $f \in C_b(W)$, which implies that $l_1 = E[\rho(-\delta u)|\mathcal{U}]$, and the proof of (4.9) follows. The relation
(4.10) is obvious since $\mathcal{U}_t \subset \mathcal{F}_t$. □

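Theorem 5. A necessary and sufficient condition for the relation (4.7), that is to say for the
representability of $L = \rho(-\delta v)$ by $U = I_W + u$, is that

\[ E[\dot{u}_t|\mathcal{U}_t] = -\dot{v}_t \circ U \]

$dt \times d\mu$-almost surely.

Proof: We have

\[ L_t \circ U = \exp\left(-\delta v^t \circ U - \frac{1}{2}|v^t \circ U|_H^2\right). \]

Moreover, using the identity

\[ \delta v^t \circ U = \int_0^t (\dot{v}_s \circ U, dW_s) + \int_0^t (\dot{v}_s \circ U, \dot{u}_s) \, ds, \]

we get

\[ L_t \circ U = \exp\left(-\int_0^t (\dot{v}_s \circ U, dW_s + \dot{u}_s \, ds) - \frac{1}{2}\int_0^t |\dot{v}_s \circ U|^2 \, ds\right). \]

Substituting all these relations in (4.7) and using the representation (4.9), we obtain

\begin{align*}
1 &= L_t \circ U \; E[\rho(-\delta u)|\mathcal{U}_t] \\
  &= \exp\left(-\int_0^t (\dot{v}_s \circ U, dW_s + \dot{u}_s \, ds) - \frac{1}{2}\int_0^t |\dot{v}_s \circ U|^2 \, ds\right)
     \exp\left(-\int_0^t (E[\dot{u}_s|\mathcal{U}_s], dZ_s) - \frac{1}{2}\int_0^t |E[\dot{u}_s|\mathcal{U}_s]|^2 \, ds\right).
\end{align*}

But

\[ \int_0^t (E[\dot{u}_s|\mathcal{U}_s], dZ_s) = \int_0^t \left(E[\dot{u}_s|\mathcal{U}_s], dW_s + (\dot{u}_s - E[\dot{u}_s|\mathcal{U}_s]) \, ds\right). \]

Consequently we get

\[ \int_0^t (\dot{v}_s \circ U + E[\dot{u}_s|\mathcal{U}_s], dW_s) = 0 \]

almost surely for any $t \in [0,1]$ and this implies that

\[ E[\dot{u}_s|\mathcal{U}_s] = -\dot{v}_s \circ U \]

$ds \times d\mu$-almost surely. The sufficiency is obvious. □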

Corollary 1. A necessary and sufficient condition for the relation (4.7) is that

\[ V \circ U = Z, \]

in other words

\[ U_t = Z_t - \int_0^t \dot{v}_s \circ U \, ds \]

almost surely, where $Z$ is the innovation process associated to $U$.

Proof: The condition in Theorem 5 reads as

\[ \dot{v}_t \circ U + E[\dot{u}_t|\mathcal{U}_t] = 0 \tag{4.11} \]

almost surely. Hence

\[ (V \circ U)(t) = U(t) + (v \circ U)(t) = Z(t) + \int_0^t E[\dot{u}_s|\mathcal{U}_s] \, ds + \int_0^t \dot{v}_s \circ U \, ds = Z(t), \]

by the relation (4.11). □



Corollary 2. Suppose that the innovation process $Z$ is an $(\mathcal{F}_t, t \in [0,1])$-local martingale; then $U$
is almost surely invertible and its inverse is $V$.

Proof: We have

\[ U_t = W_t + \int_0^t \dot{u}_s \, ds = Z_t + \int_0^t E[\dot{u}_s|\mathcal{U}_s] \, ds, \]

hence $(W_t - Z_t, t \in [0,1])$ is a continuous local martingale of finite variation. This implies that $Z$
and $W$ are equal, hence

\[ \dot{u}_t = E[\dot{u}_t|\mathcal{U}_t] \]

$dt \times d\mu$-almost surely. From Theorem 5 it follows that $\dot{u} + \dot{v} \circ U = 0$ almost surely, i.e., $V \circ U = I_W$
almost surely. It follows from Proposition 1 that

\[ U \circ V = I_W \]

also $\mu$-almost surely. □

We can give a complete characterization of the representable random variables as follows: 

Theorem 6. Assume that $L = \rho(-\delta v)$, $V = I_W + v$, $v \in L^0_a(\mu, H)$. Assume that $U = I_W + u$ is
also an adapted perturbation of identity with $E[\rho(-\delta u)] = 1$. Assume that $V \circ U = B$ is a Brownian
motion with respect to its own filtration. We have $U\mu = L \cdot \mu$ if and only if $B$ is a local martingale
with respect to the filtration generated by $U$, and in this case $B$ is equal to the innovation associated
to $U$.

Proof: The necessity has already been proven. For the sufficiency, note that we have $U = B - v \circ U$.
On the other hand we can always represent $U$ by its innovation process as

\[ U_t = Z_t + \int_0^t E[\dot{u}_s|\mathcal{U}_s] \, ds = B_t - \int_0^t \dot{v}_s \circ U \, ds, \]

where $Z$ is the innovation process associated to $U$, which is a Brownian motion with respect to
$(\mathcal{U}_t, t \in [0,1])$. Consequently

\[ -\dot{v}_s \circ U = E[\dot{u}_s|\mathcal{U}_s] \]

$ds \times d\mu$-almost surely and the proof follows from Theorem 5. □



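5. Relations with entropy

Assume that $u \in \mathbb{D}^a_{2,0}(H)$ with $E[\rho(-\delta u)] = 1$ and let $L \in L\log L(\mu)$ be the Radon-Nikodym
density of $U\mu = (I_W + u)\mu$ with respect to $\mu$. Let us represent $L$ as $\rho(-\delta v)$. Denote $E[\rho(-\delta u)|\mathcal{U}]$
by $\hat{\rho}$. Then, due to the Girsanov theorem, we have

\[ E[\hat{\rho}\log\hat{\rho}] = \frac{1}{2}E[\hat{\rho}\,|v \circ U|_H^2] = \frac{1}{2}E[\rho(-\delta u)\,|v \circ U|_H^2] = \frac{1}{2}E[|v|_H^2]. \]

In particular, the Jensen inequality implies that

\[ E[|v|_H^2] \le 2E[\rho(-\delta u)\log\rho(-\delta u)] = E[\rho(-\delta u)\,|u|_H^2]. \]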

Proposition 5. Let $P_\varepsilon$ denote the Ornstein-Uhlenbeck semigroup, denote by $v_\varepsilon$ the regularization
$P_\varepsilon v$ and denote by $u_\varepsilon$ the $H$-valued mapping which is defined by $I_W + u_\varepsilon = (I_W + v_\varepsilon)^{-1}$, whose
existence follows from [19]. The set $(u_\varepsilon, \varepsilon > 0)$ has a unique weak accumulation point $\tilde{u} \in \mathbb{D}_{2,0}(H)$.
If the relation (4.7) holds then it satisfies the following relation:

\[ \frac{d}{ds}\tilde{u}(s) \circ Z = -E[\dot{v}_s \circ U|\mathcal{Z}_s] = E[\dot{u}_s|\mathcal{Z}_s] \]

$ds \times d\mu$-almost surely, where $\mathcal{Z}$ denotes the sigma algebra generated by the innovation $Z$ associated
to $U$.
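Proof: From [19], $V_\varepsilon = I_W + v_\varepsilon$ is almost surely invertible and its inverse can be written as
$U_\varepsilon = I_W + u_\varepsilon$. Moreover $\dot{u}_\varepsilon = -\dot{v}_\varepsilon \circ U_\varepsilon$. Hence $(u_\varepsilon, \varepsilon > 0)$ is bounded in $L^2(\mu, H)$. Consequently,
there exists a subnet which converges weakly to some $\tilde{u}$. Let $\xi$ be an $H$-valued, bounded continuous
function on $W$. Denoting by $\langle\cdot,\cdot\rangle$ the duality bracket of $L^2(\mu, H)$, we get

\begin{align*}
\langle u_\varepsilon, \xi\rangle &= \langle u_\varepsilon \circ V_\varepsilon, \xi \circ V_\varepsilon \, \rho(-\delta v_\varepsilon)\rangle \\
  &= -\langle v_\varepsilon, \xi \circ V_\varepsilon \, \rho(-\delta v_\varepsilon)\rangle \\
  &\to -\langle v, \xi \circ V \rho(-\delta v)\rangle.
\end{align*}

Hence

\[ \langle\tilde{u}, \xi\rangle = -\langle v, \xi \circ V \rho(-\delta v)\rangle. \]

Consequently $\tilde{u}$ is unique, i.e., the net $(u_\varepsilon, \varepsilon > 0)$ has only one accumulation point in the weak
topology of $\mathbb{D}_{2,0}(H) = L^2(\mu, H)$. From the last hypothesis

\[ \frac{dU\mu}{d\mu} = \rho(-\delta v). \]

Hence

\begin{align*}
\langle\tilde{u}, \xi\rangle &= -\langle v, \xi \circ V \rho(-\delta v)\rangle \\
  &= -\langle v \circ U, \xi \circ V \circ U\rangle \\
  &= -\langle v \circ U, \xi \circ Z\rangle \\
  &= -E\int_0^1 E[\dot{v}_s \circ U|\mathcal{Z}_s]\,\dot{\xi}_s \circ Z \, ds.
\end{align*}

Since $Z$ is a Brownian motion, we also have

\[ \langle\tilde{u}, \xi\rangle = \langle\tilde{u} \circ Z, \xi \circ Z\rangle, \]

hence the proof is completed. □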



Remark 4. We draw the attention of the reader to the fact that in general the weak convergence
does not imply the strong convergence. The situation illustrated above is a typical example of this;
in fact if there were also a strong convergence, then $I_W + v$ would have been invertible and we would
have $I_W + \tilde{u} = I_W + u = (I_W + v)^{-1}$ (cf. [19]).

Remark 5. Similarly, suppose that $v$ is bounded and that

\[ E[|\tilde{u}|_H^2] = 2E[L\log L]. \tag{5.12} \]

Then $V = I_W + v$ is invertible and its inverse is $U = I_W + u$ with $u = \tilde{u}$. In fact this follows
from the hypothesis (5.12), which implies that

\begin{align*}
\lim_{\varepsilon \to 0} E[|u_\varepsilon|_H^2] &= \lim_{\varepsilon \to 0} E[|v_\varepsilon|_H^2 L_\varepsilon] \\
  &= E[|v|_H^2 L] \\
  &= 2E[L\log L] \\
  &= E[|\tilde{u}|_H^2].
\end{align*}

Since $\mathbb{D}_{2,0}(H)$ is a Hilbert space, the convergence of the norms implies that $\lim_{\varepsilon \to 0} u_\varepsilon = \tilde{u}$ in the
norm topology of $\mathbb{D}_{2,0}(H)$. Therefore $V$ is invertible as proven in [19]. Consequently, in the case
where the mapping $V$ is not invertible, this equality cannot take place.

The remark above suggests the following claim: 
Theorem 7. Assume that $u \in \mathbb{D}^a_{2,0}(H)$, $E[\rho(-\delta u)] = 1$ and

$\frac{dU\mu}{d\mu} = \rho(-\delta v) = L$,

such that $v \in L^0_a(\mu, H)$. Then $U = I_W + u$ is almost surely invertible with inverse $V = I_W + v$ if
and only if

$2E[L\log L] = E[|u|_H^2]$.

In other words, $U$ is invertible if and only if

$H(U\mu\,|\,\mu) = \frac12 \|u\|^2_{\mathbb{D}_{2,0}(H)}$,

where $H(U\mu\,|\,\mu)$ denotes the entropy of $U\mu$ with respect to $\mu$.

Proof: Since $U$ represents $L\,d\mu$, we have $E[\dot u_s\,|\,\mathcal{U}_s] + \dot v_s\circ U = 0$ $ds\times d\mu$-almost surely. Hence, from
the Jensen inequality, $E[|v\circ U|_H^2] \le E[|u|_H^2]$. Moreover the Girsanov theorem gives

$2E[L\log L] = E[|v|_H^2\, L] = E[|v\circ U|_H^2] = E\int_0^1 |E[\dot u_s\,|\,\mathcal{U}_s]|^2\,ds$.

Hence the hypothesis implies that

$E[|u|_H^2] = E\int_0^1 |E[\dot u_s\,|\,\mathcal{U}_s]|^2\,ds$.

From this we deduce that $\dot u_s = E[\dot u_s\,|\,\mathcal{U}_s]$ $ds\times d\mu$-almost surely. Finally we get $\dot u_s + \dot v_s\circ U = 0$
$ds\times d\mu$-almost surely, which is a necessary and sufficient condition for the claim. The necessity is obvious.

□ 
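
The deduction of $\dot u_s = E[\dot u_s\,|\,\mathcal{U}_s]$ from the equality of the two second moments rests on the orthogonality of the conditional expectation. A brief sketch, assuming only that $u \in L^2(\mu,H)$ and that $(\mathcal{U}_s)$ is the filtration of $U$:

```latex
\begin{align*}
E\int_0^1 \big|\dot u_s - E[\dot u_s \,|\, \mathcal{U}_s]\big|^2 ds
 &= E\int_0^1 |\dot u_s|^2\, ds
    - E\int_0^1 \big|E[\dot u_s \,|\, \mathcal{U}_s]\big|^2 ds \\
 &= E\big[|u|_H^2\big]
    - E\int_0^1 \big|E[\dot u_s \,|\, \mathcal{U}_s]\big|^2 ds
  \;=\; 0 .
\end{align*}
```

The integrand on the left is nonnegative, so it vanishes $ds\times d\mu$-almost surely.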

Remark 6. This theorem says that $U$ is invertible if and only if the "kinetic energy" of $u$ is equal
to the entropy of the measure that $U$ induces. Moreover $U$ is non-invertible if and only if we have

$H(U\mu\,|\,\mu) < \frac12 \|u\|^2_{\mathbb{D}_{2,0}(H)}$.

The above relation between the entropy and the (kinetic) energy can be generalized to the maps
$I_W + u$, where $u \in L^2(\mu, H)$, which do not necessarily fulfill the integrability condition $E[\rho(-\delta u)] = 1$,
as follows:

Theorem 8. Assume that $u \in L^2(\mu, H)$, let $U = I_W + u$ and define $L$ to be

$L = \frac{dU\mu}{d\mu}$.

We then have

$H(U\mu\,|\,\mu) = E[L\log L] \le \frac12 E[|u|_H^2]$.

Proof: If $|u|_H \in L^\infty(\mu)$, the claim is obvious from above. For the general case, let $(T_n, n \ge 1)$
be a sequence of stopping times increasing to infinity such that $|u_n|_H$ is bounded, where $u_n(t) =
\int_0^t 1_{[0,T_n]}(s)\,\dot u_s\,ds$. Denote by $L_n$ the Radon-Nikodym derivative of $(I_W + u_n)\mu$ w.r.t. $\mu$. From Remark
[?], it follows that the sequence $(L_n, n \ge 1)$ is uniformly integrable, hence it converges to $L$ in the
weak topology of $L^1(\mu)$. From the lower semi-continuity of the entropy w.r.t. this topology, we get

$E[L\log L] \le \liminf_n E[L_n\log L_n] \le \lim_n \frac12 E[|u_n|_H^2] = \frac12 E[|u|_H^2]$.

□ 
6. Relations with the innovation conjecture of the filtering

Let us briefly explain the question (cf. [2T], [TJ, [7] for further details): Assume that we are given a
process of the form

$y_t(w,\beta) = W_t(w) + \int_0^t h_s(w,\beta)\,ds$,

called the observation, where $\beta$ is independent of the Wiener path $w$, and $s \to h_s(w,\beta)$ is in $L^2([0,1], ds)$
almost surely and adapted to some filtration in which the filtration of $(W_t)$ can be injected. The
question is whether the filtration of $y = (y_t, t \in [0,1])$ is equal to the filtration of the innovation
process defined as before:

(6.13)    $\nu_t = y_t - \int_0^t E[h_s\,|\,\mathcal{Y}_s]\,ds$,

where $(\mathcal{Y}_s, s \in [0,1])$ is the filtration of $y$, called the observation process. The following result gives
a complete answer to the innovation conjecture in the general case to which the above problem can
be translated:

Theorem 9. Assume that $U = I_W + u$ is an adapted perturbation of identity such that $u \in \mathbb{D}_{2,0}(H)$
and that $E[\rho(-\delta u)] = 1$. Define $L$ as the Radon-Nikodym density

$L = \frac{dU\mu}{d\mu}$

and define $v \in L^0_a(\mu, H)$ by $L = \rho(-\delta v)$. Let $\mathcal{U} = (\mathcal{U}_t, t \in [0,1])$ be the filtration of $U$, completed if necessary
with $\mu$-null sets. Let $Z$ be the innovation process associated to $U$ as defined above, and denote by $\mathcal{Z} =
(\mathcal{Z}_t, t \in [0,1])$ its filtration. Then $\mathcal{U} = \mathcal{Z}$ if and only if there exists some $\hat u \in L^0_a(\mu, H)$ such that
$\hat U = I_W + \hat u$ is almost surely invertible with inverse $V = I_W + v$ and $U = \hat U\circ Z$ almost surely.

Proof: Sufficiency: We have $\mathcal{Z} \subset \mathcal{U}$ by the construction of $Z$; on the other hand the relation
$U = \hat U\circ Z$ implies that $\mathcal{U} \subset \mathcal{Z}$, hence the sufficiency is proved.

Necessity: Suppose now that $\mathcal{Z} = \mathcal{U}$, and let $L$ be the Radon-Nikodym derivative

$L = \frac{dU\mu}{d\mu}$.

Since $L > 0$ almost surely, there exists some $v : W \to H$ such that $v$ is adapted and that $L$ can
be represented as $L = \rho(-\delta v)$. Hence the random variable $L$ is represented by $U$; this implies that
$V\circ U = Z$ almost surely, where $V = I_W + v$. Since $\mathcal{U} = \mathcal{Z}$, we can write $U$ as a function of $Z$, i.e.,
$U = \hat U(Z)$. Then

$1 = \mu\{V\circ U = Z\} = \mu\{V\circ \hat U(Z) = Z\}$
$= \mu\{V\circ \hat U(w) = w\}$,

since $Z\mu = \mu$. Consequently, $\hat U$ is a right inverse of $V$. Moreover $\hat U\mu = \hat U\circ Z\mu = U\mu \sim \mu$, hence it
follows from Proposition 1 that $V\circ \hat U = \hat U\circ V = I_W$ $\mu$-almost surely. □



Corollary 3. Assume that we are in the situation described by the relation (6.13). Let us denote
by $H : W \to H$ the map defined by

$H(t,y) = \int_0^t E[h_s\,|\,\mathcal{Y}_s]\,ds$.

Denote by $V$ the mapping defined by $V = I_W - H$. Then the filtration generated by the innovation
$\nu$ is equal to the filtration of the observation $y$ if and only if

$E\left[\frac{dV\mu}{d\mu}\,\log\frac{dV\mu}{d\mu}\right] = \frac12 E\big[|H|_H^2\big]$.
Proof: It follows from Theorem 9 that the invertibility of $V$ is a necessary and sufficient condition;
then we apply Theorem 7. □

Remark 7. In pQ ; the authors treat the case where the noise is independent of the signal, this 
amounts to saying that u is independent of w; here, on the contrary, we are in a situation where the
two are correlated.

7. The properties of U o V 

As we have seen above, the mapping $V\circ U$ preserves the Wiener measure $\mu$. On the other hand we
have, from the Girsanov theorem,

$E[f\circ U\circ V\; L] = E[f\circ U\circ V\; \rho(-\delta v)]$
$= E[f\circ U]$
$= E[f L]$,

for any $f \in C_b(W)$. In other words $U\circ V$ preserves the measure $\nu$ which is defined by $d\nu = L\,d\mu$.
Let us denote $U\circ V$ by $M$. This mapping is of the form $M = I_W + m$, where $m = v + u\circ V$ is an
adapted, $H$-valued mapping.

Proposition 6. Assume that $m$ satisfies the following hypothesis:

$E[\rho(-\delta m)] = 1$,

where $\delta m$ denotes the Itô integral of $(\dot m_s, s \in [0,1])$ in the $L^0(\mu)$-sense¹. Then the mapping $M = U\circ V$
satisfies the following probabilistic Monge-Ampère equation:

(7.14)    $L\circ M\; E[\rho(-\delta m)\,|\,\mathcal{M}] = E[L\,|\,\mathcal{M}]$,

almost surely, where $\mathcal{M}$ denotes the sigma-algebra generated by $M$.

Proof: From the Girsanov theorem, for any $f \in C_b(W)$, we get

$E[f L] = E[f\circ M\; L\circ M\; \rho(-\delta m)]$.

On the other hand $M$ preserves the measure $d\nu = L\,d\mu$, hence

$E[f\circ M\; L] = E[f L]$.

Therefore

$E[f\circ M\; L\circ M\; \rho(-\delta m)] = E[f\circ M\; L]$,

for any $f \in C_b(W)$ and this proves the claim. □
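
To pass from the last identity of the proof to (7.14), one conditions on the sigma-algebra $\mathcal{M}$ generated by $M$. A sketch of this step, for an arbitrary $f \in C_b(W)$:

```latex
\begin{align*}
E\big[f\circ M \; L\circ M \; E[\rho(-\delta m) \,|\, \mathcal{M}]\big]
 &= E\big[f\circ M \; L\circ M \; \rho(-\delta m)\big] \\
 &= E\big[f\circ M \; L\big]
  \;=\; E\big[f\circ M \; E[L \,|\, \mathcal{M}]\big] .
\end{align*}
```

Both $L\circ M\,E[\rho(-\delta m)\,|\,\mathcal{M}]$ and $E[L\,|\,\mathcal{M}]$ are $\mathcal{M}$-measurable, and $f$ is arbitrary, so the two coincide almost surely.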

Let us denote by $(\mathcal{M}_t, t \in [0,1])$ the filtration generated by $M$ and let us suppose that $m = v + u\circ V$
is in $L^2(\mu, H)$. This last hypothesis is amply sufficient to ensure the existence of the dual predictable
projection $\hat m$ of $m$ with respect to the filtration $(\mathcal{M}_t, t \in [0,1])$. It can be calculated as in Proposition

$\hat m(t) = \int_0^t E[\dot m_s\,|\,\mathcal{M}_s]\,ds$,  $t \in [0,1]$.

Besides, the innovation process $(R_t, t \in [0,1])$ associated to $M$, defined by

$R_t = M_t - \int_0^t E[\dot m_s\,|\,\mathcal{M}_s]\,ds$



¹ This is an abuse of notation since the divergence coincides with the Itô integral only for the adapted elements of
$L^p(\mu, H)$ with $p > 1$.
is an $(\mathcal{M}_t, t \in [0,1])$-Brownian motion and again from [TJ, any martingale of this filtration can
be represented as a stochastic integral with respect to this innovation process. Consequently, the
martingale $E[\rho(-\delta m)\,|\,\mathcal{M}_t]$ can be represented as in Proposition 4:

$E[\rho(-\delta m)\,|\,\mathcal{M}_t] = \exp\left(-\int_0^t (E[\dot m_s\,|\,\mathcal{M}_s], dR_s) - \frac12\int_0^t |E[\dot m_s\,|\,\mathcal{M}_s]|^2\,ds\right)$.

From the Itô representation theorem, there exists an $(\mathcal{M}_t, t \in [0,1])$-adapted process $(\dot\gamma_t, t \in [0,1])$
such that $\int_0^1 |\dot\gamma_t|^2\,dt < \infty$ almost surely and that

$E[L\,|\,\mathcal{M}_t] = \exp\left(-\int_0^t (\dot\gamma_s, dR_s) - \frac12\int_0^t |\dot\gamma_s|^2\,ds\right)$.



Let us calculate the remaining term $L\circ M$ of the relation (7.14):

$L\circ M = \exp\left(-\delta v\circ M - \frac12 |v\circ M|_H^2\right)$.

Using the identity

$\delta v\circ M = \delta(v\circ M) + (v\circ M, m)_H$

and taking into account the exponents of the relation (7.14), we get

$\delta(v\circ M) + (v\circ M, m)_H + \frac12 |v\circ M|_H^2 + \int_0^1 (E[\dot m_s\,|\,\mathcal{M}_s], dR_s) + \frac12\int_0^1 |E[\dot m_s\,|\,\mathcal{M}_s]|^2\,ds$
$= \int_0^1 (\dot\gamma_s, dR_s) + \frac12 |\gamma|_H^2$,

where the letters without "dot" denote the primitives of those with "dot". If we restrict all these
calculations to the time interval $[0,t]$, for any $t \in [0,1]$, a similar relation holds; consequently we have
proven

Theorem 10. If $U\mu = \nu = L\cdot\mu$ and if $L = \rho(-\delta v)$, where $u$ and $v$ are adapted, and if $E[\rho(-\delta m)] = 1$
and if $m = v + u\circ V \in L^2(\mu, H)$, then we have the following relation between $v$, $m$ and $\gamma$:

(7.15)    $\dot v_t\circ M + E[\dot m_t\,|\,\mathcal{M}_t] = \dot\gamma_t$

$dt\times d\mu$-almost surely, where the scalar product is that of $\mathbb{R}^d$.
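
The identification leading to (7.15) can be made explicit by writing both exponents in terms of the innovation process. The following is only a sketch: the abbreviation $a_s = E[\dot m_s\,|\,\mathcal{M}_s]$ is introduced here for readability, and the computation uses $dM_s = dR_s + a_s\,ds$ together with the assumption that the Itô integral $\delta v$ composed with $M$ equals $\int_0^1 (\dot v_s\circ M, dM_s)$.

```latex
\begin{align*}
-\log\big(L\circ M\big)
 &= \int_0^1 (\dot v_s\circ M, dM_s) + \tfrac12\int_0^1 |\dot v_s\circ M|^2\, ds \\
 &= \int_0^1 (\dot v_s\circ M, dR_s) + \int_0^1 (\dot v_s\circ M, a_s)\, ds
    + \tfrac12\int_0^1 |\dot v_s\circ M|^2\, ds , \\
-\log E[\rho(-\delta m) \,|\, \mathcal{M}]
 &= \int_0^1 (a_s, dR_s) + \tfrac12\int_0^1 |a_s|^2\, ds .
\end{align*}
% Adding the two lines and completing the square:
\begin{align*}
-\log\Big(L\circ M \; E[\rho(-\delta m) \,|\, \mathcal{M}]\Big)
 &= \int_0^1 (\dot v_s\circ M + a_s, dR_s)
    + \tfrac12\int_0^1 |\dot v_s\circ M + a_s|^2\, ds .
\end{align*}
```

Comparing with $-\log E[L\,|\,\mathcal{M}] = \int_0^1 (\dot\gamma_s, dR_s) + \frac12\int_0^1 |\dot\gamma_s|^2 ds$ on each interval $[0,t]$, and using the uniqueness of the semimartingale decomposition, yields $\dot\gamma_t = \dot v_t\circ M + a_t$.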

8. Relations with the Monge's transport map 

Assume that the density $L$ is in the class $L\log L(\mu)$. It follows from [S] that there exists an
$H$-1-convex element $\varphi$ of $\mathbb{D}_{2,1}$ such that the perturbation of identity $T$ defined as

$T(w) = w + \nabla\varphi(w)$

maps the Wiener measure $\mu$ to $\nu = L\cdot\mu$, and also there is another map $S = I_W + \nabla\psi$, with $\psi \in \mathbb{D}_{2,1}$
also $H$-1-convex, such that

$\mu(\{w : S\circ T(w) = w\}) = 1$

and

$\nu(\{w : T\circ S(w) = w\}) = 1$.

In particular, whenever $\mu$ and $\nu$ are equivalent, $T$ and $S$ are inverse to each other $\mu$-almost
surely. Let us remark that neither $T$ nor $S$ is adapted to the filtration $(\mathcal{F}_t)$. We shall assume in the
sequel that $L$ is $\mu$-almost surely strictly positive and represented as before as an exponential density
$L = \rho(-\delta v)$. Let us denote by $(\mathcal{T}_t, t \in [0,1])$ the filtration generated by $(T_t, t \in [0,1])$, where $T_t$ is
defined as $T_t(w) = w(t) + \nabla\varphi(t)$ with $\nabla\varphi(t) = \int_0^t D_s\varphi\,ds$. We have
Theorem 11. Assume further that $L \in L^{1+\varepsilon}(\mu)$ for some $\varepsilon > 0$; then $T$ is a $\mu$-semimartingale with
respect to $(\mathcal{T}_t)$ and it has the following decomposition:

(8.16)    $T_t = B_t + \int_0^t \frac{E[D_s L\,|\,\mathcal{F}_s]}{L_s}\circ T\,ds$,

where $B = (B_t)$ is a $(\mathcal{T}_t)$-Brownian motion and $L_s = E[L\,|\,\mathcal{F}_s]$. Moreover (8.16) can also be expressed as

(8.17)    $T_t = B_t - \int_0^t \dot v_s\circ T\,ds$,

where $v$ is defined by $L = \rho(-\delta v)$.

Proof: Since $(W_t, t \in [0,1])$ is the canonical Brownian motion, the equality $\mathcal{T}_t = T^{-1}(\mathcal{F}_t)$ is immediate.
Consequently, for any positive, measurable function $f$, we have the following identity:

$E[f\circ T\,|\,\mathcal{T}_t] = E_\nu[f\,|\,\mathcal{F}_t]\circ T$.

This relation implies that $(T_t, t \in [0,1])$ is a $(\mu, (\mathcal{T}_t))$-quasimartingale if and only if $(W_t, t \in [0,1])$
is a $(\nu, (\mathcal{F}_t))$-quasimartingale. This latter property is immediate since $V = W + v$ is a $(\nu, (\mathcal{F}_t))$-
Brownian motion and $E_\nu[|v|_H^2] = 2E[L\log L] < \infty$. Let us calculate the drift of $(T_t, t \in [0,1])$: if $\theta$
is a bounded, $\mathcal{F}_t$-measurable cylindrical function, we have, using the integration by parts formula,

$\frac1h E[(T_{t+h} - T_t)\,\theta\circ T] = \frac1h E[(W_{t+h} - W_t)\,\theta L]$
$= \frac1h E\left[\theta\int_t^{t+h} D_s L\,ds\right]$
$\to E[\theta\, D_t L]$
$= E[\theta\, E[D_t L\,|\,\mathcal{F}_t]]$
$= E\left[\theta\, E[D_t L\,|\,\mathcal{F}_t]\,\frac{L}{L_t}\right]$

as $h \to 0$, where $L_s = E[L\,|\,\mathcal{F}_s]$. Moreover, the local martingale part is a continuous process with
$\langle B^i, B^j\rangle_t = \delta_{ij}\,t$, hence it is a Brownian motion and $(T_t)$ has the decomposition given by the formula
(8.16), which is equivalent to the decomposition given by (8.17). In fact $L$ can be represented as

$L = 1 + \int_0^1 E[D_s L\,|\,\mathcal{F}_s]\,dW_s$.

On the other hand, from Itô's formula, we have

$L = 1 - \int_0^1 \dot v_s L_s\,dW_s$,

hence $\dot v_s L_s = -E[D_s L\,|\,\mathcal{F}_s]$ $ds\times d\mu$-almost surely.

□



Remark 8. We could have guessed this theorem by observing simply that the mapping $B = V\circ T$
preserves the Wiener measure due to the Girsanov theorem. Therefore the process $(t,w) \to B(w)(t)$
is a Brownian motion with respect to its own filtration. However the theorem says that it is also a
Brownian motion with respect to the larger filtration $(\mathcal{T}_t, t \in [0,1])$.

Theorem 12. Assume that $L = \rho(-\delta v)$ satisfies the hypothesis of Theorem 11, and let $V = I_W + v$.
The map $V$ is not invertible, i.e., the equation

(8.18)    $U_t = W_t - \int_0^t \dot v_s\circ U\,ds$

has no strong solution, if and only if the equation

(8.19)    $T_t = B_t - \int_0^t \dot v_s\circ T\,ds$

has no strong solution.

Proof: Assume that $T$ is a strong solution; then by definition $T$ should be adapted to the filtration
of the Brownian motion $B = (B_t)$, hence it is of the form $T = \hat T\circ B$. Then

$1 = \mu\{B = \hat T\circ B + v\circ\hat T\circ B\}$
$= \mu\{w = \hat T(w) + v\circ\hat T(w)\}$
$= \mu\{w : V\circ\hat T(w) = w\} = \mu(D)$,

hence $\hat T$ is a right inverse to $V$. Moreover, for any $f \in C_b(W)$,

$E[f\circ\hat T] = E[f\circ\hat T\circ B] = E[f\circ T] = E[f L]$.

Therefore $\hat T\mu$ is equivalent to $\mu$. Since

$1_{\hat T(D)}\circ\hat T \ge 1_D$,

we obtain $\mu(\hat T(D)) = 1$, which means that $\hat T$ is almost surely surjective; consequently it is also a left
inverse and it follows from Proposition 1 that $\hat T$ is a strong solution to the equation (8.18), which
is a contradiction. To show the sufficiency suppose that the equation (8.18) has a strong solution
$U$; then $U$ and $V$ are inverse to each other almost surely, moreover $B = V\circ T$ is also invertible,
hence $U = T\circ B^{-1}$ is $(\mathcal{F}_t)$-adapted and this implies that $T$ is $(B^{-1}(\mathcal{F}_t))$-adapted; consequently the
equation (8.19) has a strong solution, which is a contradiction. □



9. Variational techniques for representability and invertibility

In this section we shall derive a necessary and sufficient condition for the invertibility of a large class of adapted
perturbations of identity. We begin with some technical results:

Lemma 2. Assume that $f \in \mathbb{D}_{2,1}$ and $\eta \in \mathbb{D}^a_{2,0}(H)$ such that $|\eta|_H \in L^\infty(\mu)$. Then we have

$f(w + \eta(w)) = f(w) + \int_0^1 \nabla_\eta f(w + t\eta(w))\,dt$

$\mu$-almost surely.

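Proof: If $f$ is Fréchet differentiable or if it is $H$-$C^1$, then the identity is obvious. Assume that
$(f_n, n \ge 1)$ is a sequence of such functions converging to $f$ in $\mathbb{D}_{2,1}$ and denote $I_W + \eta$ by $T_\eta$. Then
we have on the one hand

$E[|f_n\circ T_\eta - f_m\circ T_\eta|] = E\left[|f_n - f_m|\,\frac{dT_\eta\mu}{d\mu}\right] \le E[|f_n - f_m|^2]^{1/2}\, E\left[\left(\frac{dT_\eta\mu}{d\mu}\right)^2\right]^{1/2}$.

From Lemma 1 we have

$E\left[\left(\frac{dT_\eta\mu}{d\mu}\right)^2\right] = E\left[\frac{dT_\eta\mu}{d\mu}\circ T_\eta\right]$
$= E\left[\frac{1}{E[\rho(-\delta\eta)\,|\,\mathcal{T}_\eta]}\right]$
$\le E\left[\frac{1}{\rho(-\delta\eta)}\right]$
$= E\left[\exp\left(\delta\eta + \frac12|\eta|_H^2\right)\right] < \infty$,

where $\mathcal{T}_\eta$ denotes the sigma-algebra generated by $T_\eta$, since $|\eta|_H \in L^\infty(\mu)$. Hence we get that

$\lim_{n,m\to\infty} E[|f_n\circ T_\eta - f_m\circ T_\eta|] = 0$.

Similarly

$E\int_0^1 |\nabla f_n - \nabla f_m|_H\circ T_{t\eta}\,dt = E\left[|\nabla f_n - \nabla f_m|_H \int_0^1 \frac{dT_{t\eta}\mu}{d\mu}\,dt\right]$
$\le \|f_n - f_m\|_{2,1}\, E\left[\left(\int_0^1 \frac{dT_{t\eta}\mu}{d\mu}\,dt\right)^2\right]^{1/2}$
$\le \|f_n - f_m\|_{2,1}\left(E\int_0^1 \exp\left(t\,\delta\eta + \frac{t^2}{2}|\eta|_H^2\right)dt\right)^{1/2} \to 0$

as $n, m \to \infty$.

□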



Corollary 4. Assume that $f \in \mathbb{D}_{2,1}$ is $\mathcal{F}_{t_0}$-measurable for some fixed $t_0 < 1$. Then the conclusion
of Lemma 2 holds for any $u \in \mathbb{D}^a_{2,0}(H)$.

Proof: Let $(\tau_n)$ be a sequence of stopping times increasing to infinity such that $|u^{\tau_n}|_H$ is essentially
bounded, where $u^{\tau_n}$ is defined as

$u^{\tau_n}(t) = \int_0^t 1_{[0,\tau_n]}(s)\,\dot u_s\,ds$.

From Lemma 2 it follows trivially that

$f(w + u^{\tau_n}(w)) = f(w) + \int_0^1 (\nabla f(w + t\,u^{\tau_n}(w)), u^{\tau_n}(w))_H\,dt$,

moreover, on the set $\{\tau_n > t_0\}$, we have $f(w + u^{\tau_n}(w)) = f(w + u(w))$ and

$(\nabla f(w + t\,u^{\tau_n}(w)), u^{\tau_n}(w))_H = (\nabla f(w + t\,u(w)), u(w))_H$

almost surely. □

Theorem 13. Assume that $v \in \mathbb{D}_{2,2}(H)$ is such that $|v|_H \in L^\infty(\mu)$ and that

$E[\exp \varepsilon\|\nabla v\|_{op}^2] < \infty$

for some $\varepsilon > 0$, where $\|\nabla v\|_{op}$ denotes the operator norm of $\nabla v$. If the following infimum

$\inf\left\{\frac12 E\big[|\xi + v\circ(I_W + \xi)|_H^2\big] : \xi \in \mathbb{D}^a_{2,0}(H)\right\}$

is attained for some $u$, then its value is zero and $U = I_W + u$ is the inverse of the shift $I_W + v$.
Proof: The main point is to show the validity of the variational formula:

(9.20)    $v(w + u(w) + \eta(w)) = v(w + u(w)) + \int_0^1 \nabla_\eta v(w + u(w) + t\eta(w))\,dt$

almost surely, where $\eta \in \mathbb{D}^a_{2,0}(H)$ with $|\eta|_H \in L^\infty(\mu)$, and that these terms are properly integrable
in such a way that the Gâteaux derivative at $u$ of $F$ is well-defined, where $F(\xi) = \frac12 E[|\xi + v\circ(I_W + \xi)|_H^2]$ denotes the functional to be minimized. Let us denote by $v_n$ the
regularization of $v$ defined as $P_{1/n}v$, where $P_{1/n}$ is the Ornstein-Uhlenbeck semigroup. Since $v_n$ is
$H$-differentiable, we get trivially the identity:

(9.21)    $v_n(w + u(w) + \eta(w)) = v_n(w + u(w)) + \int_0^1 \nabla_\eta v_n(w + u(w) + t\eta(w))\,dt$.

By the Jensen inequality we have

(9.22)    $\sup_n E[\exp \varepsilon\|\nabla v_n\|_{op}] < \infty$.

Let us denote by $T_t$ the shift $I_W + u + t\eta$. Then

$E\int_0^1 |\nabla_\eta v_n\circ T_t|_H\,dt \le \|\eta\|_{L^\infty(\mu,H)}\, E\int_0^1 \|\nabla v_n\|_{op}\, l_t\,dt$,

where $l_t$ is the Radon-Nikodym derivative of $T_t\mu$ with respect to $\mu$. Using the Young inequality for
the dual convex functions $\exp$ and $x\log x$ we obtain, for any $\kappa > 0$,

(9.23)    $\|\nabla v_n\|_{op}\, l_t \le \exp \kappa\|\nabla v_n\|_{op} + \frac{1}{\kappa}\, l_t\log l_t$.

It is clear that, from the hypothesis and the Jensen lemma, the sequence $(\exp \kappa\|\nabla v_n\|_{op}, n \ge 1)$ is
uniformly integrable for small $\kappa > 0$. From Lemma 1,

$l_t\circ T_t\; E[\rho(-\delta(u + t\eta))\,|\,\mathcal{T}_t] \le 1$,

where $\mathcal{T}_t$ denotes the sigma-algebra generated by $T_t$, hence

$E[l_t\log l_t] = E[\log l_t\circ T_t]$
$\le E[-\log E[\rho(-\delta(u + t\eta))\,|\,\mathcal{T}_t]]$
$\le E[-\log \rho(-\delta(u + t\eta))]$
$= \frac12 E[|u + t\eta|_H^2]$
$\le E[|u|_H^2] + E[|\eta|_H^2]$.

Hence $(l_t, t \in [0,1])$ is uniformly integrable, but we also need to prove the uniform integrability of
$(l_t\log l_t, t \in [0,1])$. For this, let $A$ be any measurable subset of $W$; we have, again from Lemma 1,

$E[1_A\, l_t\log l_t] = E[1_A\circ T_t\, \log l_t\circ T_t]$
$\le E[1_A\circ T_t\,(-\log E[\rho(-\delta(u + t\eta))\,|\,\mathcal{T}_t])]$
$\le E[1_A\circ T_t\,(\delta(u + t\eta) + \frac12|u + t\eta|_H^2)]$
$\le E[1_A\circ T_t\,\delta(u + t\eta)] + E[1_A\circ T_t\,\frac12|u + t\eta|_H^2]$.

The last two terms are equivalent, hence it suffices to show that the second term can be made
arbitrarily small by choosing $\mu(A)$ small enough. However this is obvious from the integrability of
$|u + t\eta|_H^2$ and from the uniform integrability of $(l_t, t \in [0,1])$. From this and from the inequality (9.22),
we see that the left hand side of (9.23) is uniformly integrable. Consequently we can pass to the limit
in the relation (9.21) in $L^1(\mu)$ and obtain the relation (9.20). We can now calculate the Gâteaux
derivative of $F$ at $u$ in any direction $\eta \in \mathbb{D}^a_{2,0}(H)$ with $|\eta|_H \in L^\infty(\mu)$ (instead of $\eta\circ U$) as follows:

(9.24)    $F(u + \lambda\eta) - F(u) = E\int_0^\lambda \big(u + t\eta + v\circ(I_W + u + t\eta),\ (I_H + \nabla v)\circ(I_W + u + t\eta)[\eta]\big)_H\,dt$.
Let us remark that

(9.25)    $E[|u|_H\, \|\nabla v\circ(I_W + u + t\eta)\|_{op}] \le E[|u|_H^2]^{1/2}\, E[\|\nabla v\circ(I_W + u + t\eta)\|_{op}^2]^{1/2}$
$\le E[|u|_H^2]^{1/2}\, E\big[\|\nabla v\|_{op}^2\, l_{t,\eta,u}\big]^{1/2}$,

where

$l_{t,\eta,u} = \frac{d(I_W + u + t\eta)\mu}{d\mu}$

and from Lemma 1 we know that

$E[l_{t,\eta,u}\log l_{t,\eta,u}] \le \frac12 E[|u + t\eta|_H^2]$.

Hence we can commute the expectation with the Lebesgue integral in the formula (9.24). Let us
denote the expectation of the integrand of (9.24) by $F'(u + t\eta)[\eta]$. Since $v \in \mathbb{D}_{2,2}(H)$, using the
formula (9.20) for $\nabla v$ instead of $v$ and the inequality (9.25), we see that the map $t \to F'(u + t\eta)[\eta]$
is continuous on $[0,1]$. Since $u$ is minimal, we should have $F'(u)[\eta] \ge 0$ for any $\eta$ as above. Writing
the things explicitly:

$F'(u)[\eta] = E[(u + v\circ U,\ (I_H + \nabla v\circ U)\eta)_H]$
$= E[((I_H + \nabla v\circ U)^*(u + v\circ U),\ \eta)_H]$
$\ge 0$.

Replacing $\eta$ by $-\eta$ shows that this quantity is in fact equal to zero. By the invertibility of $I_H + \nabla v$, we get

$u + v\circ U = 0$

almost surely and this is equivalent to the fact that $U = I_W + u$ and $V = I_W + v$ are inverse to each
other. In particular $F(u) = 0$.
□



As an application of this kind of variational calculation in relation with the representability,
consider the problem of calculating

$\inf_\alpha E\left[\frac12|\alpha|_H^2 + f\circ(I_W + \alpha)\right]$,

where $f : W \to \mathbb{R}$ is a fixed Wiener functional. In fact, as it is shown in [2], this infimum is equal
to $-\log E[\exp(-f)]$, which is also equal to

(9.26)    $\inf_\gamma\left(\int_W f\,d\gamma + \int_W \frac{d\gamma}{d\mu}\log\frac{d\gamma}{d\mu}\,d\mu\right)$,

where the infimum is taken w.r.t. all the probability measures $\gamma$ on $(W, \mathcal{B}(W))$ and the latter is
uniquely attained at

$d\gamma_0 = \frac{1}{\int_W e^{-f}\,d\mu}\, e^{-f}\,d\mu$.
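
For the reader's convenience, here is a sketch of why the infimum (9.26) equals $-\log E[e^{-f}]$ and is attained only at the measure $d\gamma_0 = e^{-f}d\mu / E[e^{-f}]$. It assumes that $f$ is bounded and that $\gamma$ is absolutely continuous with respect to $\mu$ with finite entropy; $H(\gamma\,|\,\gamma_0)$ denotes the relative entropy.

```latex
\begin{align*}
\int_W f\, d\gamma + \int_W \frac{d\gamma}{d\mu}\log\frac{d\gamma}{d\mu}\, d\mu
 &= \int_W \Big(\log\frac{d\gamma}{d\mu} + f + \log E[e^{-f}]\Big)\, d\gamma
    - \log E[e^{-f}] \\
 &= \int_W \log\frac{d\gamma}{d\gamma_0}\, d\gamma - \log E[e^{-f}] \\
 &= H(\gamma \,|\, \gamma_0) - \log E[e^{-f}]
 \;\ge\; -\log E[e^{-f}] .
\end{align*}
```

Equality holds if and only if $H(\gamma\,|\,\gamma_0) = 0$, that is, $\gamma = \gamma_0$.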
In the next theorem we shall give sufficient conditions under which the first infimum, taken over the adapted shifts $\alpha$, is attained:

Theorem 14. Assume that $f \in \mathbb{D}_{2,1}$ is a 1-convex, bounded Wiener functional such that

$E[\exp \varepsilon|\nabla f|_H] < \infty$,

for some $\varepsilon > 0$. Then the infimum

$\inf\left\{E\left[\frac12|\alpha|_H^2 + f\circ(I_W + \alpha)\right] : \alpha \in \mathbb{D}^a_{2,0}(H)\right\}$

is attained at some $u \in \mathbb{D}^a_{2,0}(H)$ and this adapted vector field satisfies the following relation:

$\dot u_t + E[D_t f\circ U\,|\,\mathcal{F}_t] = 0$

$dt\times d\mu$-almost surely, where $U = I_W + u$. Besides we have

(1)    $\frac{dU\mu}{d\mu} = \exp\left(-\int_0^1 E_{U\mu}[D_t f\,|\,\mathcal{F}_t]\,dW_t - \frac12\int_0^1 |E_{U\mu}[D_t f\,|\,\mathcal{F}_t]|^2\,dt\right)$,

where $E_{U\mu}$ denotes the expectation with respect to the measure $U\mu$, i.e., the image of $\mu$ under
$U$.

(2) Let $\dot v_t = E_{U\mu}[D_t f\,|\,\mathcal{F}_t]$, denote by $Z$ the innovation process associated to $U$, i.e., $Z_t =
U_t - \int_0^t E[\dot u_s\,|\,\mathcal{U}_s]\,ds$, and define $l$ as

$l = \exp\left(-\int_0^1 E[\dot u_t\,|\,\mathcal{U}_t]\,dZ_t - \frac12\int_0^1 |E[\dot u_t\,|\,\mathcal{U}_t]|^2\,dt\right)$,

where $\mathcal{U}_t$ is the sigma-algebra $U^{-1}(\mathcal{F}_t) = \sigma(W_s + u(s), s \le t)$. Then $E[l] = 1$ and we have

$l\,\frac{dU\mu}{d\mu}\circ U = l\,\rho(-\delta v)\circ U = 1$

almost surely.



Proof: Let $J(\alpha)$ be the expectation above without the infimum. For $\lambda > 0$, let $D_\lambda = \{\alpha \in \mathbb{D}^a_{2,0}(H) : J(\alpha) \le \lambda\}$.
Then, for sufficiently large $\lambda$, $D_\lambda$ is a non-empty, convex set. Moreover, if $(\alpha_n, n \ge 1) \subset D_\lambda$ converges
to some $\alpha$ in $\mathbb{D}^a_{2,0}(H)$, then, writing $A_n = I_W + \alpha_n$, we have

$E\left[\frac{dA_n\mu}{d\mu}\log\frac{dA_n\mu}{d\mu}\right] \le \frac12 E[|\alpha_n|_H^2]$.

Hence the sequence of Radon-Nikodym densities $(\frac{dA_n\mu}{d\mu}, n \ge 1)$ is uniformly integrable. This property,
combined with the Lusin theorem, implies that $(f\circ A_n, n \ge 1)$ converges to $f\circ A$ in $L^p(\mu)$ for any
$p > 0$, where $A = I_W + \alpha$. Therefore $D_\lambda$ is closed; since it is convex, it is also weakly closed in
$\mathbb{D}^a_{2,0}(H)$. This implies that $\alpha \to J(\alpha)$ is weakly lower semi-continuous (l.s.c.). Since $D_\lambda$ is weakly
compact, $J$ attains its infimum on $D_\lambda$ and the convexity of $J$ implies that this infimum is a global
one. The scalar version of Theorem 13 implies that

$0 = E[(u, \alpha)_H + (\nabla f\circ U, \alpha)_H]$
$= E[(u, \alpha)_H + (\pi(\nabla f\circ U), \alpha)_H]$,

for any bounded $\alpha \in \mathbb{D}^a_{2,0}(H)$, where $\pi$ denotes the dual predictable projection. Hence we get

$\dot u_t + E[D_t f\circ U\,|\,\mathcal{F}_t] = 0$

$dt\times d\mu$-almost surely. Taking the conditional expectation of this relation with respect to $\mathcal{U}_t$, we
obtain immediately

(9.27)    $E[\dot u_t\,|\,\mathcal{U}_t] + E_{U\mu}[D_t f\,|\,\mathcal{F}_t]\circ U = 0$

$dt\times d\mu$-almost surely. It is a simple calculation to see that the equation (9.27) implies

$l\,\rho(-\delta v)\circ U = 1$

almost surely. From the Girsanov theorem, we get

$1 = E[l\,\rho(-\delta v)\circ U] \le E[\rho(-\delta v)]$,

therefore $E[\rho(-\delta v)] = 1$. Similarly, for any positive, measurable $g$ on $W$, we have

$E[g\circ U] = E[g\circ U\; l\,\rho(-\delta v)\circ U] \le E[g\,\rho(-\delta v)]$,

therefore

$\frac{dU\mu}{d\mu} \le \rho(-\delta v)$,
since both are probability densities, they are equal $\mu$-almost surely. To prove $E[l] = 1$ it suffices to
write $l = 1/\rho(-\delta v)\circ U$; then

$E[l] = E\left[\frac{1}{\rho(-\delta v)}\circ U\right]$
$= E\left[\frac{1}{\rho(-\delta v)}\,\rho(-\delta v)\right]$
$= 1$

and this completes the proof.

□
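
The "simple calculation" showing that (9.27) implies $l\,\rho(-\delta v)\circ U = 1$ can be sketched as follows. The shorthand $a_t = E[\dot u_t\,|\,\mathcal{U}_t]$ is introduced here only for readability; with it, (9.27) reads $\dot v_t\circ U = -a_t$ and the definition of the innovation gives $dU_t = dZ_t + a_t\,dt$.

```latex
\begin{align*}
\log\big(\rho(-\delta v)\circ U\big)
 &= -\int_0^1 (\dot v_t\circ U, dU_t) - \tfrac12\int_0^1 |\dot v_t\circ U|^2\, dt \\
 &= \int_0^1 (a_t, dZ_t) + \int_0^1 |a_t|^2\, dt - \tfrac12\int_0^1 |a_t|^2\, dt \\
 &= \int_0^1 (a_t, dZ_t) + \tfrac12\int_0^1 |a_t|^2\, dt
 \;=\; -\log l .
\end{align*}
```

Exponentiating gives $\rho(-\delta v)\circ U = 1/l$, which is the claimed identity.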



Remark 9. Suppose that $\|\nabla^2 f\|_{op} \le c < 1$ almost surely, where $c > 0$ is a fixed constant and the
norm is the operator norm on $H$. Then the map $\Phi : \mathbb{D}^a_{2,0}(H) \to \mathbb{D}^a_{2,0}(H)$ defined by

$\Phi(\xi) = -\pi(\nabla f\circ(I_W + \xi))$,

where $\pi$ denotes the dual predictable projection, is a strict contraction, hence there exists a unique
$u \in \mathbb{D}^a_{2,0}(H)$ which satisfies the equation

$\dot u_t + E[D_t f\circ U\,|\,\mathcal{F}_t] = 0$

$dt\times d\mu$-almost surely.
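
A sketch of the contraction estimate behind this remark, under the stated bound on $\nabla^2 f$. It uses that the dual predictable projection $\pi$ does not increase the $L^2(\mu,H)$ norm, and it assumes that the mean value formula applies to $\nabla f$ along the segment joining the two shifts; $\xi$ and $\zeta$ denote two arbitrary elements of the domain of $\Phi$.

```latex
\begin{align*}
\|\Phi(\xi) - \Phi(\zeta)\|_{L^2(\mu,H)}
 &\le \big\|\nabla f\circ(I_W + \xi) - \nabla f\circ(I_W + \zeta)\big\|_{L^2(\mu,H)} \\
 &= \Big\|\int_0^1 \nabla^2 f\big(\,\cdot + \zeta + t(\xi - \zeta)\big)[\xi - \zeta]\, dt
    \Big\|_{L^2(\mu,H)} \\
 &\le c\, \|\xi - \zeta\|_{L^2(\mu,H)} , \qquad c < 1 .
\end{align*}
```

The Banach fixed point theorem then gives a unique $u$ with $u = \Phi(u)$, which is the displayed equation.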

Corollary 5. Let $u \in \mathbb{D}^a_{2,0}(H)$ be a minimizer whose existence is assured by Theorem 14. Define
$U = I_W + u$. Then

$\frac{dU\mu}{d\mu} = \frac{e^{-f}}{E[e^{-f}]} = L$

if and only if $U$ is a.s. invertible.

Proof: Since

$J(u) = E[f L] + E[L\log L] = E[f\circ U] + \frac12 E[|u|_H^2]$

and since by the hypothesis we have $E[f L] = E[f\circ U]$, we obtain

$E[L\log L] = \frac12 E[|u|_H^2]$.

On the other hand, from Theorem 14,

$E[L\log L] = E[\log L\circ U]$
$= E[-\log l]$
$= \frac12 E\int_0^1 |E[\dot u_s\,|\,\mathcal{U}_s]|^2\,ds$.

Consequently, $\dot u_s = E[\dot u_s\,|\,\mathcal{U}_s]$ $ds\times d\mu$-almost surely. This implies that $E[\rho(-\delta u)] = 1$, hence the
hypothesis of Theorem 7 is satisfied and the invertibility of $U$ follows. Conversely, suppose that $U$
is invertible, and let $M$ be the Radon-Nikodym density of $U\mu$ w.r.t. $\mu$. Then we have

$J(u) = \int_W f\,M\,d\mu + \int_W M\log M\,d\mu$,

hence $M\,d\mu = L\,d\mu$ by the uniqueness of the solution of the minimization problem (9.26).

□